perf(version): wrap Ranges in Rc inside VersionRange #73
Merged
Callgrind on rez's 188-case benchmark (post-#71/#72) showed `SmallVec::extend` + `Drop` at ~4 % of cycles, almost entirely from `VersionRange::clone`. Every `Requirement::clone()` (in `extracted_request.clone()`, the per-pair `package_request.clone()` in `reduce_by`, the `req.clone()` and `package_request.clone()` in `Reduction`, etc.) deep-copies the inner `Ranges`'s `SmallVec` of `(Bound, Bound)` segments. After the rest of the perf stack (#66/#67/#68/#70/#71/#72), this is the largest non-amortised allocation cost left.

Switch the inner from `Ranges<RerVersion>` to `Rc<Ranges<RerVersion>>`. `Rc<T>::clone` is a refcount bump; `Rc<T>::Hash`/`Eq` defer to the inner `T`, so the derived semantics on `VersionRange` are unchanged.

Methods that build a new range (`intersection`, `union`, `complement`, `from_versions`, `span`, `split`, ...) still produce a fresh `Ranges` internally and wrap it with `Rc::new`; the win is on the read / clone path, not the construction path.

`as_ranges()` still returns `&Ranges` (via `Rc::deref`). `into_ranges` now uses `Rc::unwrap_or_clone`, which falls back to a clone if the `Rc` is shared, but it is the consume-the-`VersionRange` API and rare in practice.

## Benchmark (188 cases, release, same machine, two runs)

| Stage | Total | Mean | vs rez |
|------------------------------------|--------:|-------:|-------:|
| Baseline (post-#71/#72), median | ~12.7 s | 68 ms | ~30× |
| + this change, run 1 | 11.2 s | 60 ms | 34.1× |
| + this change, run 2 | 11.3 s | 60 ms | 33.7× |

**~11 % on top of #72**, **~74 % cumulative from main** (43.0 s → 11.2 s, 8.8× rez → 34.1× rez).

Differential test (`cargo test … --ignored`): 17.73 s, **188/188 still match rez 1:1**.

Predicted 3–5 %. The slightly bigger gain reflects that `VersionRange::clone` cascades into a lot more than just the `SmallVec::extend` it was attributed to in the callgrind exclusive view: it also drove allocator-side work and the matching `Drop`s.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
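The commit message above describes the wrapping in prose; a minimal sketch of the idea may help. The `Ranges` struct here is a hypothetical stand-in (a plain `Vec` of integer pairs, not the real `Ranges<RerVersion>` with its `SmallVec` of `(Bound, Bound)` segments), and the field name `inner` and the constructor `new` are assumptions made for illustration only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

// Hypothetical stand-in for the real `Ranges<RerVersion>`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Ranges {
    segments: Vec<(u32, u32)>,
}

// The derives go through `Rc<Ranges>`, whose `PartialEq`/`Eq`/`Hash`
// impls compare and hash the pointed-to value, not the pointer.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct VersionRange {
    inner: Rc<Ranges>, // assumed field name
}

impl VersionRange {
    // Construction path: still builds a fresh `Ranges`, then wraps it.
    fn new(segments: Vec<(u32, u32)>) -> Self {
        VersionRange { inner: Rc::new(Ranges { segments }) }
    }

    // Read path: borrow through the `Rc` (deref coercion).
    fn as_ranges(&self) -> &Ranges {
        &self.inner
    }

    // Consume path: moves the value out if this is the only owner,
    // otherwise clones it. Requires Rust 1.76 or newer.
    fn into_ranges(self) -> Ranges {
        Rc::unwrap_or_clone(self.inner)
    }
}

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let a = VersionRange::new(vec![(1, 3), (5, 9)]);

    // Cloning shares the allocation instead of copying the segments.
    let b = a.clone();
    assert!(Rc::ptr_eq(&a.inner, &b.inner));
    assert_eq!(Rc::strong_count(&a.inner), 2);

    // A separately built range with the same segments is still equal
    // and hashes identically, even though it is a different allocation.
    let c = VersionRange::new(vec![(1, 3), (5, 9)]);
    assert!(!Rc::ptr_eq(&a.inner, &c.inner));
    assert_eq!(a, c);
    assert_eq!(hash_of(&a), hash_of(&c));

    // `b` is shared with `a`, so this falls back to a clone.
    let owned = b.into_ranges();
    assert_eq!(&owned, a.as_ranges());
}
```

The trade-off in this sketch matches the one stated in the commit message: cloning becomes a reference-count increment, while any operation that produces a new range still pays for one allocation.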
doubleailes added a commit that referenced this pull request on May 16, 2026
…hemerals

The Vec-returning accessors (`Solver::resolved_packages` / `resolved_ephemerals`, `ResolvePhase::solved_variants` / `solved_ephemerals`) force callers that only need to iterate to pay for an intermediate `Vec` plus, for ephemerals, an owning clone of each `Requirement`. Add borrowing-iterator siblings so future consumers can stream without allocating.

New API, additive; every existing Vec method keeps its signature:

    Solver::resolved_packages_iter() -> Option<impl Iterator<Item = Rc<PackageVariant>>>
    Solver::resolved_ephemerals_iter() -> Option<impl Iterator<Item = &Requirement>>
    ResolvePhase::iter_solved_variants() -> impl Iterator<Item = Rc<PackageVariant>>
    ResolvePhase::iter_solved_ephemerals() -> impl Iterator<Item = &Requirement>

`iter_solved_variants` still has to clone `self.scopes` internally because `get_solved_variant` takes `&mut self` (it triggers a deferred sort on the variant slice); the saving vs the Vec form is the trailing `.collect()`. `iter_solved_ephemerals` is the bigger win: pure borrow, zero allocation, no per-element clone.

Refactor the existing Vec methods to delegate to the iter forms so there's one implementation of the filter logic.

## pyrer wired through

`crates/rer-python/src/lib.rs` switches its `SolveResult` build to use `resolved_packages_iter` / `resolved_ephemerals_iter`. Two intermediate `Vec`s skipped per solved result; for ephemerals every entry is now read by reference instead of cloned then stringified.

## Tests

- `test_iter_resolved_packages_matches_vec_form` and `test_iter_resolved_ephemerals_matches_vec_form` in `solver::tests`: confirm iter and Vec forms agree on the same input and that the iter form returns `None` on a failed solve.
- Existing 5 Python tests for `resolved_ephemerals` still pass (the FFI surface is unchanged; just the implementation under it).

## Verification

- `cargo test` (Rust): 41/41 unit tests pass (was 39 + 2 new).
- `cargo test … --ignored` (188-case differential): 188/188 still match rez 1:1 in 17.68 s.
- `pytest tests/` (all Python): 80/80.

## Perf (188-case benchmark, same machine as README reference)

| | Total | Mean | Median |
|---|---:|---:|---:|
| README reference (post-#73) | 11.35 s | 60 ms | 30 ms |
| This branch, pre iter-forms (run 1) | 11.19 s | 60 ms | 28 ms |
| This branch, pre iter-forms (run 2) | 11.27 s | 60 ms | 33 ms |
| This branch, post iter-forms (run 1) | 10.90 s | 58 ms | 28 ms |
| This branch, post iter-forms (run 2) | 11.16 s | 59 ms | 30 ms |

Within run-to-run noise; if anything a slight improvement from the avoided allocations. No regression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
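The "Vec form delegates to the iter form" pattern from the commit above can be sketched as follows. Every type here is a simplified, hypothetical stand-in: the real `Requirement`, `ResolvePhase`, and `Solver` have different fields, and the `ephemeral` flag and `solved` field are assumptions made only so the example is self-contained. Only the method names and return shapes are taken from the commit message.

```rust
// Hypothetical, simplified stand-ins for the real solver types.
#[derive(Clone, Debug, PartialEq)]
struct Requirement {
    name: String,
    ephemeral: bool, // assumed flag; the real filter logic differs
}

struct ResolvePhase {
    requirements: Vec<Requirement>,
}

impl ResolvePhase {
    // Borrowing iterator: no intermediate Vec, no per-element clone.
    fn iter_solved_ephemerals(&self) -> impl Iterator<Item = &Requirement> {
        self.requirements.iter().filter(|r| r.ephemeral)
    }

    // Existing Vec form keeps its signature and delegates, so the
    // filter logic lives in exactly one place.
    fn solved_ephemerals(&self) -> Vec<Requirement> {
        self.iter_solved_ephemerals().cloned().collect()
    }
}

struct Solver {
    // `None` models a failed solve in this sketch.
    solved: Option<ResolvePhase>,
}

impl Solver {
    fn resolved_ephemerals_iter(&self) -> Option<impl Iterator<Item = &Requirement>> {
        self.solved
            .as_ref()
            .map(|phase| phase.iter_solved_ephemerals())
    }
}

fn main() {
    let solver = Solver {
        solved: Some(ResolvePhase {
            requirements: vec![
                Requirement { name: "eph_a".to_string(), ephemeral: true },
                Requirement { name: "pkg_b".to_string(), ephemeral: false },
            ],
        }),
    };

    // Stream names by reference; nothing is cloned here.
    let names: Vec<&str> = solver
        .resolved_ephemerals_iter()
        .expect("solve succeeded")
        .map(|r| r.name.as_str())
        .collect();
    assert_eq!(names, vec!["eph_a"]);

    // A failed solve yields `None` rather than an empty iterator.
    let failed = Solver { solved: None };
    assert!(failed.resolved_ephemerals_iter().is_none());
}
```

Returning `impl Iterator` rather than a concrete type keeps the filter chain an implementation detail, which is one plausible reason the additive API can coexist with the unchanged Vec signatures.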
Summary
Callgrind on the rez 188-case benchmark (post-#66/#67/#68/#70/#71/#72) put `SmallVec::extend` + `Drop` at ~4 % of cycles, almost entirely from `VersionRange::clone` deep-copying the inner `Ranges`'s `SmallVec` of `(Bound, Bound)` segments. `VersionRange` clones happen everywhere on the hot path — `Requirement::clone` (extract loop, per-pair `reduce_by`, `Reduction` construction, ...) — so they accumulate.
Switch the inner from `Ranges` to `Rc<Ranges>`.
Single-file change in `rer-version`.
Benchmark (188 cases, release, same machine, two runs)
~11 % additive on top of #72. Predicted 3–5 % — came in higher because `VersionRange::clone` cascades into more than just the `SmallVec::extend` it shows up as in the exclusive view; it also drives allocator-side work and the matching `Drop`s.
Cumulative from main: 43.0 s → 11.2 s, 8.8× rez → 34.1× rez (−74 %). Differential test got the same lift: 188/188 still match rez 1:1, in 17.73 s (was 20.70 s).
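As a quick check of the percentages quoted above, the arithmetic can be reproduced from the reported wall times. The helper name `reduction_pct` is hypothetical, and the ~12.7 s baseline is itself an approximate median, so the "on top of #72" figure is only good to about a percentage point.

```rust
/// Percentage reduction in wall time going from `before` to `after` (seconds).
fn reduction_pct(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

fn main() {
    // Cumulative from main: 43.0 s -> 11.2 s.
    let cumulative = reduction_pct(43.0, 11.2);
    // On top of #72: ~12.7 s baseline -> 11.2 s (run 1) and 11.3 s (run 2).
    let run1 = reduction_pct(12.7, 11.2);
    let run2 = reduction_pct(12.7, 11.3);
    // Differential test: 20.70 s -> 17.73 s.
    let differential = reduction_pct(20.70, 17.73);

    println!("cumulative:   {cumulative:.1} %");
    println!("run 1:        {run1:.1} %");
    println!("run 2:        {run2:.1} %");
    println!("differential: {differential:.1} %");

    // Consistent with the quoted "~74 %" and "~11 %" figures.
    assert_eq!(cumulative.round(), 74.0);
    assert!(run1 > 11.0 && run1 < 12.0);
    assert!(run2 > 11.0 && run2 < 12.0);
    // The differential test improves by roughly 14 %.
    assert_eq!(differential.round(), 14.0);
}
```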
Correctness
🤖 Generated with Claude Code